The goal of this PR is to let developers use LLMs across their development and production environments just by specifying a short model family name, without writing conditional statements to handle differences between environments.
E.g., calling `llms("llama3.2")` should obtain a reference to the `llama3.2:3b` model when working with Ollama local inference, and a reference to `meta-llama/Llama-3.2-3B-Instruct` when using Hugging Face Hub serverless inference.
Differences in model naming, and the fact that each model family is actually a collection of dozens of variants that differ in parameter count (distillation) and quantization technique, make name resolution a tricky task.
Here are the new capabilities of the llms() API:
1. Tower now recognizes ~170 model family names (as of August 2025).
2. Users can specify a model family, e.g. `llms("deepseek-r1")`, in both local and Tower cloud environments, and Tower will resolve the family to a particular model that is available for inference:
- in a local environment, Tower finds the installed models of that family and, if there are several, picks the one with the largest number of parameters
- in Tower cloud environments, Tower takes the first model returned by a Hugging Face Hub search, making sure that the model is servable by the inference provider, if the user specified one
In addition to model family names, users can also specify a particular model, in both local and Tower cloud environments:
locally: llms("deepseek-r1:14b") or llms("llama3.2:latest")
serverlessly: llms("deepseek-ai/DeepSeek-R1-0528") or llms("meta-llama/Llama-3.2-3B-Instruct")
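The dispatch described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the resolver names mirror functions quoted later in the review, but the bodies (including the `FAMILY_TO_REPO` table and the `:latest` fallback) are simplified assumptions.

```python
# Hypothetical sketch of family-vs-full-name resolution; names and fallbacks
# are assumptions based on the PR description, not the real implementation.

# Assumed sample mapping from a family name to a Hub repo id.
FAMILY_TO_REPO = {"llama3.2": "meta-llama/Llama-3.2-3B-Instruct"}

def resolve_ollama_model_name(requested_model: str) -> str:
    # A full local name like "deepseek-r1:14b" is used as-is; a bare family
    # name would be resolved against locally installed models (simplified
    # here to a ":latest" suffix).
    return requested_model if ":" in requested_model else requested_model + ":latest"

def resolve_hugging_face_hub_model_name(requested_model: str) -> str:
    # A fully qualified repo id ("org/name") is used as-is; a bare family
    # name would be resolved via a Hub search (stubbed with a lookup table).
    if "/" in requested_model:
        return requested_model
    return FAMILY_TO_REPO.get(requested_model, requested_model)

def resolve_model_name(router: str, requested_model: str) -> str:
    """Route a requested model name to the resolver for the active router."""
    if router == "ollama":
        return resolve_ollama_model_name(requested_model)
    elif router == "hugging_face_hub":
        return resolve_hugging_face_hub_model_name(requested_model)
    raise ValueError(f"Inference router {router} not supported.")
```

With this shape, `resolve_model_name("hugging_face_hub", "llama3.2")` yields the full repo id, while an explicit id passes through unchanged.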
Expected use:
A developer wants to use a model of the "llama3.2" family, first in development and then in production. They would add this code to their Tower app:
```python
import os

model_name = os.getenv("MODEL_NAME")
llm = llms(model_name)
```
They would set up their environments as follows:
"dev" environment:
Tower Secrets:
MODEL_NAME = "llama3.2"
(any model of this family installed locally will do)
TOWER_INFERENCE_ROUTER=ollama
"prod" environment:
Tower Secrets:
MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"
(use a particular model)
TOWER_INFERENCE_ROUTER=hugging_face_hub
TOWER_INFERENCE_ROUTER_API_KEY=hf_123456789
TOWER_INFERENCE_SERVICE=<together|...>
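To make the flow concrete, here is a minimal sketch of how the same app code could react to those secrets. The variable names come from the PR; the validation logic and defaults are assumptions for illustration.

```python
import os

# Simulate a "dev"-style environment; in a real Tower app these would come
# from Tower Secrets. The defaults here are assumptions for the sketch.
os.environ.setdefault("MODEL_NAME", "llama3.2")
os.environ.setdefault("TOWER_INFERENCE_ROUTER", "ollama")

model_name = os.getenv("MODEL_NAME")
router = os.getenv("TOWER_INFERENCE_ROUTER", "ollama")

# Reject routers the SDK does not know about.
if router not in ("ollama", "hugging_face_hub"):
    raise ValueError(f"Inference router {router} not supported.")

# The same app code then runs unchanged in "prod", where the secrets point
# at a Hugging Face repo id and the hugging_face_hub router instead.
print(router, model_name)
```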
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1. added HF Hub and Ollama as dependencies to pyproject.toml (not a brad comment)
2. bumped HF version to 0.34.3
3. updated pytest.ini.template with settings to find source code
4. in tower context and elsewhere, renamed inference_service to inference_provider
5. in llms, addressed a bunch of comments from brad
6. removed test_env.py
7. addressed lots of brad comments re test_llms.py
8. updated uv.lock
Initial check-in of changes to the llms() API improving model name resolution
Pull Request Overview
This PR updates the LLM functionality to support both local and remote inference, replacing the previous hardcoded model mappings with a more flexible model resolution system that works with Ollama locally and Hugging Face Hub remotely.
Key Changes:
- Refactored the LLM system to dynamically resolve model names instead of using hardcoded mappings
- Added comprehensive test coverage for both local (Ollama) and remote (Hugging Face Hub) inference scenarios
- Updated configuration to use generic inference router/provider terminology instead of Hugging Face-specific naming
Reviewed Changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/tower/_llms.py | Complete rewrite of LLM functionality with dynamic model resolution, expanded model family support, and unified inference interface |
| src/tower/_context.py | Updated context configuration to use generic inference terminology instead of Hugging Face-specific variables |
| tests/tower/test_llms.py | New comprehensive test suite covering model resolution and inference for both local and remote scenarios |
| tests/tower/test_tables.py | Fixed import path from tower.pyiceberg to pyiceberg |
| pyproject.toml | Updated dependencies to support new LLM functionality |
| pytest.ini.template | Added pytest configuration template |
| README.md | Updated testing documentation |
| .vscode/settings.json | Added VS Code pytest configuration |
| .github/workflows/build-binaries.yml | Added python3-dev package for building dependencies |
Comments suppressed due to low confidence (1)
src/tower/_llms.py:6
- [nitpick] The import alias `ollama_list_models` is misleading since `list` is the actual function name. Consider using a more descriptive alias like `ollama_list` or importing it directly as `list` and using qualified calls like `ollama.list()`.
```python
from ollama import list as ollama_list_models
```
```python
model_family = model_name.split(':')[0]
size = model.get('size', 0)
details = model.get('details', {})
parameter_size=details.get('parameter_size', '')
```
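The `parameter_size` field extracted above is what the largest-parameter pick described earlier could key on. Here is a sketch of that selection, assuming model entries shaped like Ollama's `list()` response; the parsing rules and helper names are assumptions, not the PR's code.

```python
def parse_parameter_size(s: str) -> float:
    """Convert an Ollama-style parameter_size string such as '3.2B' or
    '500M' into a float number of billions; unparseable values sort last."""
    s = s.strip().upper()
    try:
        if s.endswith("B"):
            return float(s[:-1])
        if s.endswith("M"):
            return float(s[:-1]) / 1000.0
        return float(s)
    except ValueError:
        return 0.0

def pick_largest(models: list) -> str:
    """Pick the installed model with the most parameters."""
    best = max(models, key=lambda m: parse_parameter_size(
        m.get("details", {}).get("parameter_size", "")))
    return best["model"]

# Entries shaped like Ollama's list() response (hypothetical sample data):
installed = [
    {"model": "deepseek-r1:7b", "details": {"parameter_size": "7.6B"}},
    {"model": "deepseek-r1:14b", "details": {"parameter_size": "14.8B"}},
]
print(pick_largest(installed))  # deepseek-r1:14b
```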
Missing space after '=' in parameter assignment. Should be `parameter_size = details.get('parameter_size', '')` for consistency with other assignments in the same block.

```diff
- parameter_size=details.get('parameter_size', '')
+ parameter_size = details.get('parameter_size', '')
```
There was a problem hiding this comment.
Will fix in a future release.
```python
size = model.get('size', 0)
details = model.get('details', {})
parameter_size=details.get('parameter_size', '')
quantization_level=details.get('quantization_level', '')
```
Missing space after '=' in parameter assignment. Should be `quantization_level = details.get('quantization_level', '')` for consistency with other assignments in the same block.

```diff
- quantization_level=details.get('quantization_level', '')
+ quantization_level = details.get('quantization_level', '')
```
There was a problem hiding this comment.
Will fix in a future release.
```python
raise ValueError(f"Inference router {ctx.inference_router} not supported.")

if ctx.inference_router == "ollama":
    return resolve_ollama_model_name(ctx,requested_model)
```
Missing space after comma in function call. Should be `resolve_ollama_model_name(ctx, requested_model)` for consistency with coding standards.

```diff
- return resolve_ollama_model_name(ctx,requested_model)
+ return resolve_ollama_model_name(ctx, requested_model)
```
There was a problem hiding this comment.
Will fix in a future release.
```python
if ctx.inference_router == "ollama":
    return resolve_ollama_model_name(ctx,requested_model)
elif ctx.inference_router == "hugging_face_hub":
    return resolve_hugging_face_hub_model_name(ctx,requested_model)
```
Missing space after comma in function call. Should be `resolve_hugging_face_hub_model_name(ctx, requested_model)` for consistency with coding standards.

```diff
- return resolve_hugging_face_hub_model_name(ctx,requested_model)
+ return resolve_hugging_face_hub_model_name(ctx, requested_model)
```
There was a problem hiding this comment.
Will fix in a future release.
```python
    - quantization_level: quantization level if specified
    """
    try:
        models = ollama_list_models()
```
The function `ollama_list_models` is imported as an alias for `list` from ollama, but `list` expects no arguments while this call passes none. However, the ollama `list()` function should be called to get the models. This may cause a runtime error if the function signature doesn't match expectations.
Will fix in a future release.